- 
                Notifications
    
You must be signed in to change notification settings  - Fork 4.2k
 
update node info processors to include unschedulable nodes #8520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update node info processors to include unschedulable nodes #8520
Conversation
| 
           i'm working on adding more unit tests for this behavior, but i wanted to share this solution so we could start talking about it.  | 
    
a0ebb28    to
    3270172      
    Compare
  
    | 
           i've rewritten this patch to use all nodes as the secondary value instead of using a new list of ready unschedulable nodes.  | 
    
| 
           i need to do a little more testing on this locally, but i think this is fine for review.  | 
    
| // Last resort - unready/unschedulable nodes. | ||
| for _, node := range nodes { | ||
| // we want to check not only the ready nodes, but also ready unschedulable nodes. | ||
| for _, node := range append(nodes, allNodes...) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not sure that this is appropriate to append these. theoretically the allNodes should already contain nodes. i'm going to test this out using just allNodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
due to filtering that happens in obtainNodeLists, we need to combine both lists of nodes here.
3270172    to
    cb2649a      
    Compare
  
    | 
           i updated the argument names in the   | 
    
| 
           it seems like the update to the mixed node processor needs a little more investigation.  | 
    
cb2649a    to
    fd53c0b      
    Compare
  
    | 
           it looks like we need both the   | 
    
fd53c0b    to
    906a939      
    Compare
  
    | 
           rebased  | 
    
| 
           @jackfrancis @towca any chance at a review here?  | 
    
        
          
                cluster-autoscaler/processors/nodeinfosprovider/mixed_nodeinfos_processor.go
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | 
           refactored to put the unschedulable flag clearing behind a flag. i'm not totally happy with this solution as it feels a little sneaky to add the boolean value to the   | 
    
c1f4a88    to
    57994b4      
    Compare
  
    | newNode.Labels[apiv1.LabelHostname] = newName | ||
| 
               | 
          ||
| if taintConfig != nil && taintConfig.ShouldIgnoreNodeUnschedulable() { | ||
| newNode.Spec.Unschedulable = false | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: IMO it'd read a bit better if the ifs were nested.
| balancingIgnoreLabelsFlag = multiStringFlag("balancing-ignore-label", "Specifies a label to ignore in addition to the basic and cloud-provider set of labels when comparing if two node groups are similar") | ||
| balancingLabelsFlag = multiStringFlag("balancing-label", "Specifies a label to use for comparing if two node groups are similar, rather than the built in heuristics. Setting this flag disables all other comparison logic, and cannot be combined with --balancing-ignore-label.") | ||
| awsUseStaticInstanceList = flag.Bool("aws-use-static-instance-list", false, "Should CA fetch instance types in runtime or use a static list. AWS only") | ||
| ignoreNodeUnschedulable = flag.Bool("ignore-node-unschedulable", false, "Specifies that the CA should ignore a node's .spec.unschedulable field in node templates when considering to scale a node group.") | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: IMO the flag/option name is pretty vague and you need the description to understand it. On the other hand I can't think of a different name that isn't horribly verbose (something like --assume-template-node-always-schedulable comes to mind) 😅
@jackfrancis WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably prefer something more "positive" like scaleFromUnschedulable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that sound fine to me, --scale-from-unschedulable be an acceptable flag then?
| 
           Thanks a lot for all the work on this PR @elmiko! 
 IMO this is perfectly fine, the  LGTM, holding for other reviewers. /lgtm  | 
    
| 
           Missed the /approve  | 
    
| 
           thanks @towca , i'm happy to adjust the names to make it easier to understand.  | 
    
This change introduces a flag which will instruct the CA to ignore a node's `.spec.unschedulable` field when creating node template for considering which node group to scale.
57994b4    to
    4c4511b      
    Compare
  
    | 
           updated to rename the flag, and variables,   | 
    
| 
           /label tide-merge-method-squash thanks @elmiko!  | 
    
| 
           @jackfrancis: The label(s)  In response to this: 
 Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.  | 
    
| 
           [APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: elmiko, jackfrancis, towca The full list of commands accepted by this bot can be found here. The pull request process is described here 
Needs approval from an approver in each of these files:
 
      
 Approvers can indicate their approval by writing   | 
    
| 
           /label tide/merge-method-squash  | 
    
…nodes (kubernetes#8520) * pass allNodes to node info provider Process This change passes all the nodes to the mixed node info provider processor that is called from `RunOnce`. The change is to allow unschedulable and unready nodes to be processed as bad canidates during the node info template generation. The Process function has been updated to separate nodes into good and bad candidates to make the filtering match the original intent. * add --scale-from-unschedulable flag This change introduces a flag which will instruct the CA to ignore a node's `.spec.unschedulable` field when creating node template for considering which node group to scale.
…nodes (kubernetes#8520) * pass allNodes to node info provider Process This change passes all the nodes to the mixed node info provider processor that is called from `RunOnce`. The change is to allow unschedulable and unready nodes to be processed as bad canidates during the node info template generation. The Process function has been updated to separate nodes into good and bad candidates to make the filtering match the original intent. * add --scale-from-unschedulable flag This change introduces a flag which will instruct the CA to ignore a node's `.spec.unschedulable` field when creating node template for considering which node group to scale.
…nodes (kubernetes#8520) * pass allNodes to node info provider Process This change passes all the nodes to the mixed node info provider processor that is called from `RunOnce`. The change is to allow unschedulable and unready nodes to be processed as bad canidates during the node info template generation. The Process function has been updated to separate nodes into good and bad candidates to make the filtering match the original intent. * add --scale-from-unschedulable flag This change introduces a flag which will instruct the CA to ignore a node's `.spec.unschedulable` field when creating node template for considering which node group to scale.
…nodes (kubernetes#8520) * pass allNodes to node info provider Process This change passes all the nodes to the mixed node info provider processor that is called from `RunOnce`. The change is to allow unschedulable and unready nodes to be processed as bad canidates during the node info template generation. The Process function has been updated to separate nodes into good and bad candidates to make the filtering match the original intent. * add --scale-from-unschedulable flag This change introduces a flag which will instruct the CA to ignore a node's `.spec.unschedulable` field when creating node template for considering which node group to scale.
What type of PR is this?
/kind bug
What this PR does / why we need it:
This PR adds a new lister for ready unschedulable nodes, it also connects that lister to a new parameter in the node info processors
Processfunction. This change enables the autoscaler to use unschedulable, but otherwise ready, nodes as a last resort when creating node templates for scheduling simulation.Which issue(s) this PR fixes:
Fixes #8380
Special notes for your reviewer:
I'm not sure if this is the best way to solve this problem, but i am proposing this for further discussion and design.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: